83 research outputs found

    A Non-Negative Matrix Tri-Factorization Based Method for Predicting Antitumor Drug Sensitivity

    Large annotated cell line collections have been shown to enable the prediction of drug response in the pre-clinical setting. We present an enhancement of the Non-Negative Matrix Tri-Factorization method, which allows the integration of different data types for the prediction of missing associations. To test our method, we retrieved a dataset from the Cancer Cell Line Encyclopedia (CCLE) that links cell lines to drugs through their IC50 values, and we extended it by linking cell lines to their respective tissue of origin and genomic profile. We performed two kinds of experiments: (a) prediction of missing values in the matrix, and (b) prediction of the complete drug profile of a new cell line, demonstrating the validity of the method in both scenarios.
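
To make the factorization idea concrete, here is a minimal sketch of plain non-negative matrix tri-factorization, X ≈ F S Gᵀ, fitted with standard multiplicative updates. It is not the enhanced method of the abstract: it does not integrate tissue or genomic data and does not mask missing entries. The toy matrix `ic50`, the ranks, and all function names are assumptions made for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def frob_error(x, f, s, g):
    """Squared Frobenius norm of X - F S G^T."""
    approx = matmul(matmul(f, s), transpose(g))
    return sum((xv - av) ** 2 for xr, ar in zip(x, approx) for xv, av in zip(xr, ar))

def mult_update(m, numer, denom, eps=1e-9):
    """Element-wise multiplicative update; entries stay non-negative."""
    return [[mv * nv / (dv + eps) for mv, nv, dv in zip(mr, nr, dr)]
            for mr, nr, dr in zip(m, numer, denom)]

def nmtf(x, k1, k2, iters=200, seed=0):
    """Factorize X (n x m) into F (n x k1), S (k1 x k2), G (m x k2)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])

    def rand(rows, cols):
        # Strictly positive random initialization
        return [[rng.random() + 0.1 for _ in range(cols)] for _ in range(rows)]

    f, s, g = rand(n, k1), rand(k1, k2), rand(m, k2)
    xt = transpose(x)
    for _ in range(iters):
        sgt = matmul(s, transpose(g))  # S G^T, shape k1 x m
        f = mult_update(f, matmul(x, transpose(sgt)),
                        matmul(f, matmul(sgt, transpose(sgt))))
        fs = matmul(f, s)              # F S, shape n x k2
        g = mult_update(g, matmul(xt, fs),
                        matmul(g, matmul(transpose(fs), fs)))
        ft = transpose(f)
        s = mult_update(s, matmul(matmul(ft, x), g),
                        matmul(matmul(matmul(ft, f), s), matmul(transpose(g), g)))
    return f, s, g

# Made-up cell-line x drug matrix with two response patterns (not CCLE data)
ic50 = [
    [5.0, 5.0, 1.0, 1.0],
    [5.0, 5.0, 1.0, 1.0],
    [1.0, 1.0, 4.0, 4.0],
    [1.0, 1.0, 4.0, 4.0],
    [5.0, 5.0, 1.0, 1.0],
]
f0, s0, g0 = nmtf(ic50, 2, 2, iters=0)   # random starting point
f, s, g = nmtf(ic50, 2, 2, iters=200)    # same start, after 200 updates
error_before = frob_error(ic50, f0, s0, g0)
error_after = frob_error(ic50, f, s, g)
print(error_before, error_after)
```

In a prediction setting, the entries of F S Gᵀ at positions with no measured value would serve as the predicted associations; handling those positions properly requires a masked objective, which this sketch omits.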

    Computational algorithms to predict Gene Ontology annotations

    Background: Gene function annotations, which are associations between a gene and a term of a controlled vocabulary describing gene functional features, are of paramount importance in modern biology. Datasets of these annotations, such as the ones provided by the Gene Ontology Consortium, are used to design novel biological experiments and interpret their results. Despite their importance, these sources of information have some known issues. They are incomplete, since biological knowledge is far from definitive and evolves rapidly, and they may contain some erroneous annotations. Since the curation of novel annotations is costly in both money and time, computational tools that can reliably predict likely annotations, and thus speed up the discovery of new gene annotations, are very useful. Methods: We used a set of computational algorithms and weighting schemes to infer novel gene annotations from a set of known ones. We used the latent semantic analysis approach, implementing two popular algorithms (Latent Semantic Indexing and Probabilistic Latent Semantic Analysis), and proposed a novel method, the Semantic IMproved Latent Semantic Analysis, which adds a clustering step on the set of considered genes. Furthermore, we propose improving these algorithms by weighting the annotations in the input set. Results: We tested our methods and their weighted variants on the Gene Ontology annotation sets of the genes of three model organisms (Bos taurus, Danio rerio and Drosophila melanogaster). The methods were able to predict novel gene annotations, and the weighting procedures led to a valuable improvement, although the results vary with the size of the input annotation set and the algorithm considered. Conclusions: Of the three methods considered, the Semantic IMproved Latent Semantic Analysis provides the best results. In particular, when coupled with a proper weighting policy, it predicts a significant number of novel annotations, proving to be a helpful tool for supporting scientists in the curation of gene functional annotations.
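
The latent semantic idea behind these methods can be sketched in a few lines: approximate the binary gene-by-term annotation matrix with a low-rank matrix, then read high reconstructed values at unannotated positions as candidate annotations. The sketch below is a rank-1 special case computed by power iteration, not the Latent Semantic Indexing implementation, the clustering step, or the weighting schemes of the abstract. The toy matrix and the function name `rank1_scores` are assumptions.

```python
def rank1_scores(a, iters=100):
    """Rank-1 approximation of a gene-by-term matrix via power iteration."""
    n_terms = len(a[0])
    v = [1.0] * n_terms
    for _ in range(iters):
        # One power-iteration step on A^T A: u = A v, then v = A^T u
        u = [sum(aij * vj for aij, vj in zip(row, v)) for row in a]
        v = [sum(a[i][j] * u[i] for i in range(len(a))) for j in range(n_terms)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    # Reconstruct (A v) v^T, the projection onto the dominant term direction
    u = [sum(aij * vj for aij, vj in zip(row, v)) for row in a]
    return [[ui * vj for vj in v] for ui in u]

# Made-up annotations: rows are genes, columns are ontology terms
annotations = [
    [1, 1, 1],  # gene_a
    [1, 1, 1],  # gene_b
    [1, 1, 0],  # gene_c, not annotated to the third term
]
scores = rank1_scores(annotations)
candidate = scores[2][2]  # reconstructed score for the missing annotation
print(round(candidate, 3))
```

Because gene_c shares its other annotations with gene_a and gene_b, the reconstructed score at the unannotated position is well above zero, which is the signal such methods threshold to propose a new annotation.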

    Identifying collateral and synthetic lethal vulnerabilities within the DNA-damage response.

    Background: A pair of genes is defined as synthetically lethal if defects in both cause the death of the cell, while a defect in only one of the two is compatible with cell viability. Ideally, if A and B are two synthetic lethal genes, inhibiting B should kill cancer cells with a defect in A and should have no effect on normal cells. Thus, synthetic lethality can be exploited for highly selective cancer therapies, which need to exploit differences between normal and cancer cells. Results: In this paper, we present a new method for predicting synthetic lethal (SL) gene pairs. As neighbouring genes in the genome have highly correlated profiles of copy number alterations (CNAs), our method clusters proximal genes with a similar CNA profile, then predicts mutually exclusive group pairs, and finally identifies the SL gene pairs within each group pair. For mutual-exclusion testing we use a graph-based method which takes into account the mutation frequencies of different subjects and genes. We use two different methods for selecting the pair of SL genes: the first is based on gene essentiality, measured in various conditions by means of the "Gene Activity Ranking Profile" (GARP) score; the second leverages the annotations of genes to biological pathways. Conclusions: This method is unique among current SL prediction approaches: it reduces false-positive SL predictions compared to previous methods, and it allows explicit collateral lethality relationships to be established for gene pairs within mutually exclusive group pairs.

    MCTK: a Multi-modal Conversational Troubleshooting Kit for supporting users in web applications

    Conversational interfaces for user assistance are becoming pervasive. Today, though, most chatbots are not integrated into the application in which they are placed, but only superimposed on it, with no communication between the conversational and the graphical interface. We propose the Multi-modal Conversational Troubleshooting Kit (MCTK), a Python package for easily integrating a conversational troubleshooting agent into web applications. MCTK is multi-modal: once the system recognizes the problem the user is encountering, the textual solution in the chat is coupled with visual hints in the GUI. On top of that, MCTK is easy to configure and offers separation of concerns: dialogue designers can work on the conversation without modifying the code, and vice versa.

    Large-diameter carbon-composite monofilaments

    Large-diameter carbon composite monofilaments with high strength and high modulus were produced by pregging multifiber carbon bundles with suitable organic resins and pyrolysing them together. Two approaches were developed to increase the utilization of fiber tensile strength by minimizing stress concentration defects induced by dissimilar shrinkage during pyrolysis: matrix modification, to improve char yield and strain-to-failure, and fiber-matrix copyrolysis, to alleviate matrix cracking. The highest tensile strength and elastic modulus were obtained by heat treatment to 2873 K, which matches fiber and matrix strain-to-failure.

    General Adaptive Neighborhood Image Restoration, Enhancement and Segmentation

    This paper outlines the recently introduced General Adaptive Neighborhood Image Processing (GANIP) approach [1–3]. An intensity image is represented with a set of local neighborhoods defined for each point of the image to be studied. These so-called General Adaptive Neighborhoods (GANs) are simultaneously adaptive to the spatial structures, the analyzing scales and the physical settings of the image to be addressed and/or the human visual system. After a brief theoretical introductory survey, the GANIP approach is applied to real application examples in image restoration, enhancement and segmentation.

    GenoMetric Query Language: A novel approach to large-scale genomic data management

    Motivation: Improvements in sequencing technologies and data processing pipelines are rapidly providing sequencing data, with associated high-level features, of many individual genomes in multiple biological and clinical conditions. These data allow for data-driven genomic, transcriptomic and epigenomic characterizations, but require state-of-the-art ‘big data’ computing strategies, with abstraction levels beyond available tool capabilities. Results: We propose a high-level, declarative GenoMetric Query Language (GMQL) and a toolkit for its use. GMQL operates downstream of raw data preprocessing pipelines and supports queries over thousands of heterogeneous datasets and samples; as such it is key to genomic ‘big data’ analysis. GMQL leverages a simple data model that provides both abstractions of genomic region data and associated experimental, biological and clinical metadata, and interoperability between many data formats. Based on the Hadoop framework and the Apache Pig platform, GMQL ensures high scalability, expressivity, flexibility and simplicity of use, as demonstrated by several biological query examples on ENCODE and TCGA datasets. Availability and implementation: The GMQL toolkit is freely available for non-commercial use at http://www.bioinformatics.deib.polimi.it/GMQL/. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

    DeepBrain: Functional Representation of Neural In-Situ Hybridization Images for Gene Ontology Classification Using Deep Convolutional Autoencoders

    This paper presents a novel deep learning-based method for learning a functional representation of mammalian neural images. The method uses a deep convolutional denoising autoencoder (CDAE) to generate an invariant, compact representation of in situ hybridization (ISH) images. While most existing methods for bio-imaging analysis were not developed to handle images with highly complex anatomical structures, the results presented in this paper show that the functional representation extracted by the CDAE can help learn features of functional gene ontology categories and classify them with high accuracy. Using this CDAE representation, our method outperforms the previous state-of-the-art classification rate, improving the average AUC from 0.92 to 0.98, i.e., achieving a 75% reduction in error. To make the method computationally feasible, it operates on input images that were significantly downsampled with respect to the original ones.
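
The 75% figure follows from treating 1 − AUC as the error, an interpretation assumed here because the abstract does not spell it out. A short check under that assumption (the function name is hypothetical):

```python
def relative_error_reduction(auc_before, auc_after):
    """Fraction of the (1 - AUC) error removed when AUC improves."""
    err_before = 1.0 - auc_before
    err_after = 1.0 - auc_after
    return (err_before - err_after) / err_before

reduction = relative_error_reduction(0.92, 0.98)
print(f"{reduction:.0%}")  # prints 75%
```

The error drops from 0.08 to 0.02, so three quarters of the remaining error is removed.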